ABSTRACT
player discovers new exploits according to an independent random process. Upon
discovery,  the  player  must  decide  if  and  when  to  exercise  a  munition  based  on  that 
exploit.  The  payoff  from  using  the  munition  is  a  function  of  time  that  is  (generally) 
increasing.  These  factors  create  a  basic  tension:  the  longer  a  player  waits  to  exercise  a 
munition,  the  greater  his  payoff  because  the  munition  is  more  mature,  but  also  the  greater 
the  chance  that  the  opponent  will  also  discover  the  exploit  and  nullify  the  munition. 
Assuming  perfect  knowledge,  and  under  mild  restrictions  on  the  time-dependent  payoff 
function  for  a  munition,  we  derive  optimal  exercise  strategies  and  quantify  the  value  of 
engaging  in  cyber  conflict.  Our  analysis  also  leads  to  high-level  insights  on  cyber 
conflict  strategy. 
EXECUTIVE  SUMMARY


Conflict  in  cyberspace  is  difficult  to  analyze;  methods  developed  for  other 
dimensions  of  conflict,  such  as  land  warfare,  war  at  sea,  and  missile  warfare,  do  not 
adequately  address  cyber  conflict.  A  characteristic  that  distinguishes  cyber  conflict  is  that 
actors  do  not  know  the  true  state  of  their  arsenal(s) — i.e.,  an  opponent  may  negate  a 
potential  attack  by  discovering  and  fixing  the  vulnerability  in  their  system;  they  may  do 
this  without  knowledge  of  their  adversary’s  intentions. 

Our analysis focuses on the national level, with decisions and actions that would
be available to a Unified Commander. This is fundamentally different from analyses that
focus on the defense of a specific technological system.

In  this  report,  we  develop  a  rigorous  game-theoretic  description  of  two  players 
and  a  single  vulnerability.  We  do  so  under  an  assumption  of  perfect  information ,  in  the 
sense  that  as  soon  as  a  player  discovers  a  vulnerability  he  knows  with  certainty  if  the 
adversary has also discovered it. We consider a Commander with limited resources who
faces a single decision. Upon discovering a vulnerability, he may:

•  Wait: Waiting increases the damage of a munition based on the
vulnerability; however, it also risks the adversary’s discovery of the
vulnerability, negating the munition’s effectiveness.

•  Attack:  Attacking  exercises  an  available  munition. 

From  these  assumptions,  we  gain  the  following  insights: 

•  Success  requires  rapid  action.  Our  model  shows  that  delays  in  taking 
action  reduce  the  chance  of  a  player’s  success  in  cyber  conflict.  Such 
delays  can  come  from  a  variety  of  sources,  including  bureaucratic  or 
command  restrictions.  A  byproduct  of  our  model  is  the  calculation  of  how 
proficient  a  player  must  be  in  other  areas  to  make  up  for  delays  in  taking 
action;  in  most  cases,  the  required  capability  is  unattainable.  The 
immediate  consequence  of  this  is  that  command  structures  in  cyberspace 
should  be  agile  with  the  correct  level  of  delegation  of  authority. 

•  Prospects  for  deterrence  in  cyber  conflict  may  be  limited.  The  ability  of 
players  to  deter  their  opponents  from  attacking  depends  on  an  assured 
second  strike.  In  cyber  conflict,  opposing  players  may  have  munitions 
based  on  the  same  exploit,  and  the  first  player  to  use  the  exploit  effectively 
removes  second  strike  munitions  from  the  opponent’s  arsenal. 
Complicating  factors  to  the  cyber  conflict  game,  such  as  an  inability  to 
identify  the  player  who  performed  a  cyber  attack,  or  a  player’s  ability  to 
respond  with  kinetic  munitions,  also  have  an  effect  on  deterrence  in 
cyber  conflict. 

The framework contained herein not only informs the decision facing a
commander in conflict, but also allows for exploratory analysis, particularly of the
trade-offs between speed of detection and speed of attack development. Therefore, this
model may be useful for both cyber warriors and budget analysts.
I.  INTRODUCTION


Conflict  in  Cyberspace,  or  cyber  conflict,  is  important  at  both  the  strategic  and 
tactical  levels.  In  this  paper,  we  consider  the  strategic  decisions  made  by  states  or  other 
groups  about  when  and  how  to  engage  in  cyber  conflict.  The  increasing  dependency  on 
interconnected  networks,  both  in  military  and  civilian  life,  means  that  little  is  beyond  the 
reach  of  cyberspace.  Cyberspace  plays  a  central  role  in  our  social,  economic,  and  civic 
welfare.  It  is,  therefore,  not  surprising  that  the  United  States  “has  identified  cyber  security 
as  one  of  the  most  serious  economic  and  national  security  challenges  we  face  as  a  nation” 
(National Security Council, 2010). Consequently, security and defense in cyberspace have
become an increasingly large part of the defense budget (Sternstein, 2011).

A  defining  characteristic  of  cyber  conflict  is  the  way  in  which  weapons  in 
cyberspace  are  discovered,  developed,  and  employed.  Players  search  for  mechanisms  that 
can cause cyber systems to perform in ways not intended in their original design, called
exploits,  and,  once  found,  develop  them  into  one  or  more  cyber  munitions.  These 
munitions  can  then  be  used  as  part  of  a  cyber  attack.  In  searching  for  exploits  to  use 
against  an  adversary,  a  player  may  also  discover  flaws  in  their  own  system  and  decide  to 
patch  them  so  an  adversary  cannot  use  them.  Moreover,  a  player  could  develop  munitions 
based  on  an  exploit  that  the  adversary  independently  fixes,  thereby  making  the  munitions 
obsolete.  Thus,  collections  of  cyber  munitions,  or  arsenals,  are  dynamic  and  their 
effectiveness  depends  on  the  relative  state  of  knowledge  of  the  opponents. 

In  this  context,  apparently  simple  questions  such  as  “how  long  should  we  hold  a 
munition  in  development  before  using  it  in  an  attack?”  and  “how  should  we  allocate 
limited  resources  to  offense  versus  defense?”  require  novel,  analytical  models.  Moreover, 
the  dynamic  nature  of  cyber  weapons  development  and  obsolescence  makes  it  difficult  to 
assess  the  potency  of  an  arsenal;  this  is  true  for  assessing  our  own  arsenal  as  well  as  an 
arsenal  belonging  to  an  adversary.  Clear,  useful  analysis  at  the  national  level  is  important 
both  for  making  sound  future  investment  decisions  and  for  creating  informed  strategic 
and  policy  guidance. 

To  analyze  the  strategic  decisions  involved  in  cyber  conflict,  we  use  a  game 
theoretic  framework — we  view  cyber  warfare  as  a  game  consisting  of  attacks  that 
opposing  players  exercise  at  a  time  of  their  choosing.  Each  player  discovers,  develops, 
and  chooses  to  exercise  attacks  to  maximize  the  value  of  their  cyber  operations.  Our 
analysis  is  independent  of  specific  technologies,  and  does  not  assume  an  explicit  cyber 
system  or  exploit. 

Using  minimal  assumptions,  our  model  leads  to  two  fundamental  insights: 

•  Success  requires  rapid  action.  Our  model  shows  that  delays  in  taking 
action  reduce  the  chance  of  a  player’s  success  in  cyber  conflict.  Such 
delays  can  come  from  a  variety  of  sources,  including  bureaucratic  or 
command  restrictions.  A  byproduct  of  our  model  is  the  calculation  of  how 
proficient  a  player  must  be  in  other  areas  to  make  up  for  delays  in  taking 
action;  in  most  cases,  the  required  capability  is  unattainable.  The 
immediate consequence of this is that command structures in cyberspace
should  be  agile  with  the  correct  level  of  delegation  of  authority. 

•  Prospects  for  deterrence  in  cyber  conflict  may  be  limited.  The  ability  of 
players  to  deter  their  opponents  from  attacking  depends  on  an  assured 
second  strike.  In  cyber  conflict,  opposing  players  may  have  munitions 
based  on  the  same  exploit,  and  the  first  player  to  use  the  exploit  effectively 
removes  second  strike  munitions  from  the  opponent’s  arsenal. 
Complicating  factors  to  the  cyber  conflict  game,  such  as  an  inability  to 
identify  the  player  who  performed  a  cyber  attack,  or  a  player’s  ability  to 
respond  with  kinetic  munitions,  also  have  an  effect  on  deterrence  in 
cyber  conflict. 
II.  RELATED  WORK


The  JASON  2010  report,  The  Science  of  Cyber-Security  (JASON,  2010), 
recommends  a  variety  of  analytic  approaches  and  suggests  borrowing  ideas  from  other 
sciences  such  as  physics,  cryptography,  and  biological  sciences,  including  epidemiology. 
The  JASON  report  introduces  a  two-player,  stationary,  discrete-time  model  called  the 
Forwarder’s Dilemma as an example of what a game-theoretic analysis might look like.
This  game  considers  whether  an  administrator  should  forward  another  system’s  messages 
on  their  network  and  is  similar  both  in  format  and  solution  to  the  well-known  Prisoner’s 
Dilemma  (e.g.  Fudenberg  &  Tirole,  1991).  Lye  and  Wing  (2002)  and  Shen,  Chen,  Blasch, 
and  Tadda  (2007)  also  consider  cyber  attacks  in  the  context  of  a  game.  The  most 
comprehensive  survey  of  game  theory  and  cyberspace  is  by  Shiva,  Dasgupta,  and  Wu 
(2010).  They  develop  a  taxonomy  of  game  theoretic  models  with  two  broad  categories: 

•  Static  versus  Dynamic.  A  “one  shot”  cyber  conflict  game,  where  players 
choose  plans  of  action  and  then  execute  them  simultaneously,  is  a  static 
game.  A  cyber  conflict  game  with  multiple  stages  and  sequential  decisions 
is  a  dynamic  game. 

•  Available  Information.  Players  may  have  exact,  imperfect,  or  no 
knowledge  about  their  opponent’s  intentions  or  capabilities.  If  the  players 
know  the  actions  of  other  players  once  taken,  this  is  called  a  game  with 
perfect  information.  If  the  players  know  the  structure  of  the  game  and 
payoffs,  but  not  the  actions,  this  is  called  a  game  with  complete 
information.  Finally,  a  game  in  which  the  payoffs  evolve  in  time  in  a 
random  process  is  a  stochastic  game. 

While  game  theory  considers  both  cooperative  and  noncooperative  games,  work 
to  date  on  cyber  conflict  deals  only  with  noncooperative  games.  In  the  taxonomy  of  Shiva 
et  al.  (2010)  our  proposed  model  is  a  noncooperative,  dynamic,  stochastic  game  with 
perfect  information. 

The  previous  study  that  has  the  most  commonality  with  our  approach  is  that  of 
Lye  and  Wing  (2002).  They  consider  a  two-player,  stochastic  game  between  an  attacker 
and  administrator.  Their  model  is  at  the  machine  level;  it  focuses  on  an  attacker 
attempting  to  find  the  best  policy  among  a  portfolio  of  several  attacks  to  damage  a 
university  computer  network.  This  game  theoretic  model  of  Lye  and  Wing  maps  to  the 
tactical  level  of  conflict,  as  opposed  to  our  model  that  is  focused  at  the  strategic  level 
between  two  players  engaged  in  cyber  conflict. 

Our  work  differs  from  previous  work  by  abstracting  cyber  conflict  away  from 
individual  machines  and  instruction  sets  in  the  same  manner  that  Lanchester  equations 
(Washburn  &  Kress,  2009)  abstract  physical  conflict  away  from  soldiers  and  weapons. 
The  goal  of  this  paper  is  to  provide  a  foundation  from  which  to  build  more  complex 
models,  towards  the  ultimate  goal  of  integrating  the  cyber  domain  into  the  spectrum  of 
conflict  analysis  in  order  to  support  strategic  models  for  decision  makers  at  the 
national  level. 
III.  ANALYSIS


A.  FOUNDATION 

As  defined  previously,  a  computer  system  may  contain  exploits.  These  are 
unknown  until  discovered,  after  which  they  can  be  fixed  in  the  form  of  a  patch  or 
weaponized  into  a  munition.  We  model  the  life-cycle  of  a  single  cyber  exploit  as  a 
four-stage  process. 

1.  Discovery  of  the  Exploit 

We  model  the  discovery  of  a  single  exploit  by  each  player  as  a  random  process, 
occurring  independently  for  each  player,  which  may  depend  on  factors  such  as  training, 
investment,  experience  and  luck. 

2.  Development  of  Munition 

Once  an  exploit  is  discovered,  a  player  can  develop  a  munition  based  on  the 
exploit.  We  assume  that  there  is  a  relationship  between  the  length  of  time  that  a  player 
knows  about  an  exploit  and  the  effectiveness  of  the  munition  he  develops  based  on  that 
exploit.  Munitions  may  only  be  developed  for  known  exploits. 

3.  Employment 

Once  a  munition  is  developed,  it  can  be  employed  at  will  against  an  adversary  in 
an  attack. 

4.  Obsolescence 

Consider  a  game  between  two  players,  Player  1  and  Player  2.  If  Player  1  discovers 
an  exploit  in  his  system  and  patches  it  before  Player  2  can  develop  and  employ  a 
munition  based  on  that  exploit,  then  that  munition  becomes  obsolete. 

Uncertainties  about  the  obsolescence  of  a  player’s  own  arsenal  are  a  key 
dimension  in  the  analysis  of  cyber  conflict.  For  the  purposes  of  this  analysis,  we  assume 
that  a  player  who  is  aware  of  an  exploit  also  knows  whether  the  other  player(s)  are  aware 
of  the  same  exploit;  this  removes  one  type  of  uncertainty.  For  a  player  who  is  unaware  of 
an  exploit,  we  assume  neither  player  knows  the  time  until  the  unaware  player  discovers 
the  exploit.  This  uncertainty  in  discovery  times  is  the  fundamental  tension  that  our  model 
seeks  to  explore. 

We  model  cyber  warfare  as  a  Markov  game  (Thie,  1983;  Fudenberg  &  Tirole, 
1991)  where  the  choices  available  to  each  player  depend  on  the  number  of  exploits 
known  by  each  player  and  the  strength  of  each  player’s  munitions.  In  general,  there  may 
be  multiple  exploits  that  each  player  discovers,  develops  into  munitions,  and  uses  to 
attack,  but  we  choose  to  focus  our  analysis  on  a  scenario  where  there  is  only  a  single 
exploit  to  be  discovered.  At  the  beginning  of  this  scenario,  neither  player  knows  the 
exploit. Each player probabilistically discovers the exploit, and when either player
chooses to attack, the payoffs are determined and the game terminates.

B.  FORMULATION 

Our  model  focuses  on  a  strategic  cyber  conflict  between  two  players,  where  there 
is a single exploit to be discovered. Let $i \in \{1, 2\}$ index the players. The mathematical
notation  used  to  describe  the  game  falls  into  three  broad  categories:  Discovery, 
Development,  and  Employment. 

1.  Discovery 

Let  T  be  the  duration  of  time  that  an  exploit  has  existed,  which  we  also  call  the 
clock  time.  Without  loss  of  generality,  we  assume  that  the  game  starts  when  the  exploit  is 
created.  We  create  a  discrete-time  model,  with  T  increasing  over  the  set  of  positive 
integers.  If  the  exploit  was  part  of  the  original  system,  then  T  is  the  age  of  the  system.  If 
the  exploit  was  introduced  as  part  of  a  software  upgrade,  then  T  is  the  age  of  the 
upgrade. Let $d_i$ be the moment in clock time that Player $i$ discovers the exploit. We
define $\tau_i = \max(0, T - d_i)$ to be the relative time that Player $i$ has known about the
exploit; we call this Player $i$'s holding time. By definition, if Player $i$ is not aware of the
exploit, then $\tau_i = 0$. We define a state of the cyber game, $S$, as:

$$S = (T, \tau_1, \tau_2)$$

where  the  elements  of  this  three-tuple  represent  how  long  the  exploit  has  existed,  how 
long  Player  1  has  known  the  exploit,  and  how  long  Player  2  has  known  the  exploit, 
respectively. 
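
The state definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original model description: the names `State`, `holding_time`, and `make_state` are hypothetical, and representing "not yet discovered" by `None` is an assumption made here for convenience.

```python
from typing import NamedTuple, Optional


class State(NamedTuple):
    """Game state S = (T, tau1, tau2)."""
    T: int     # clock time: how long the exploit has existed
    tau1: int  # holding time of Player 1
    tau2: int  # holding time of Player 2


def holding_time(T: int, d: Optional[int]) -> int:
    """Return tau = max(0, T - d).

    d is the discovery time; None (an assumption of this sketch) means the
    player has not discovered the exploit, which gives a holding time of 0.
    """
    if d is None:
        return 0
    return max(0, T - d)


def make_state(T: int, d1: Optional[int], d2: Optional[int]) -> State:
    """Build the three-tuple state from the clock time and discovery times."""
    return State(T, holding_time(T, d1), holding_time(T, d2))
```

For example, `make_state(5, 3, None)` describes an exploit that has existed for five periods, that Player 1 has held for two periods, and that Player 2 has not yet found.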

2.  Development 

A  player’s  success  in  cyber  conflict  depends  on  both  his  ability  to  discover 
exploits  and  his  ability  to  develop  effective  munitions.  We  assume  that  at  any  moment 
following the discovery $d_i$, Player $i$ has the ability to create and deploy a perfectly
effective patch. However, we assume that the act of deploying the patch effectively
announces it to the adversary; so patching nullifies everyone’s munitions based on that
exploit, and this ends the game for both sides. Let $p_i(T)$ denote the probability that
Player $i$ discovers an exploit as clock time progresses from period $T$ to period $T + 1$. For
convenience, let $q_i(T) = 1 - p_i(T)$. Let $a_i(\tau_i)$ be the value of an attack by Player $i$ using
a munition developed using a holding time of $\tau_i$. The value of an attack is a function of
$\tau$ instead of $T$ because we assume that once the exploit is known, the effectiveness of
the munition depends on holding time and not clock time. We impose two constraints on
$a_i(\tau_i)$. First, we assume $a_i(0) = 0$; namely, that if an exploit is not known, then an attack
based on it has no value. Additionally, we assume $0 \le a_i(\tau_i) \le B_i$, where $B_i$ is an arbitrary
upper bound, thus disallowing cyber attacks with either a negative value or an
infinite value.
3.  Employment

Once a player has a cyber munition, he may choose to use it. Let $\theta_i(T)$ denote
the action set of Player $i$ at time $T$. We define $\theta_i(T) \subseteq \{W, A\}$ where:

•  $W$: Wait. While a player is waiting, he is either waiting to discover the
exploit ($\tau_i = 0$) or he may know about the exploit ($\tau_i > 0$) and be working
to make his munitions more effective.

•  $A$: Attack. When a player attacks he receives the value of his attack at
that time. Attacking also broadcasts the attack’s underlying exploit to
all players.

A player who does not know the exploit has a singleton action set, $\{W\}$, and a
player that does know the exploit has the full action set, $\{W, A\}$.
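
The rule tying the action set to the holding time can be written as a one-line function. This is only a sketch: the function name `action_set` and the string labels `"W"` and `"A"` are illustrative choices, not notation from the model.

```python
def action_set(tau: int) -> frozenset:
    """Actions available to a player whose holding time is tau.

    A player who does not know the exploit (tau == 0) can only wait;
    a player who knows it (tau > 0) may wait or attack.
    """
    if tau < 0:
        raise ValueError("holding time must be nonnegative")
    if tau > 0:
        return frozenset({"W", "A"})
    return frozenset({"W"})
```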

C.  ZERO-SUM  GAME  WITH  PERFECT  INFORMATION 

To  fully  specify  the  game,  we  must  define  action  sets  for  each  player,  and  the 
utilities for the players’ actions. We assume a zero-sum, strategic conflict; i.e., that any utility
gain  by  one  player  results  in  an  equal  utility  loss  by  the  opponent.  We  use  the  convention 
that  Player  1  is  a  maximizing  player  and  Player  2  is  a  minimizing  player.  We  assume  that 
each  player  knows  the  state  of  the  Markov  game,  S.  But  this  perfect  information 
assumption  does  not  mean  that  a  player  knows  the  exploit.  A  player  is  still  limited  by  his 
action set. For example, if the state of the game is $(T, 1, 0)$, it means that: Player 1 knows
the exploit, has a holding time of 1, and has an action set of $\{W, A\}$; while Player 2 does
not know the exploit, has a holding time of 0, and therefore has an action set of
solely $\{W\}$.

1.  Markov  Game  Transitions 

The  discovery  and  development  of  attacks  is  modeled  as  transitions  in  the  state  of 
the  Markov  game.  The  game  begins  in  the  state  (0,0,0)  and  proceeds  in  discrete  rounds. 

In each round, the clock time $T$ increases deterministically. For each Player $i$, the
holding time $\tau_i = 0$ until the player discovers the exploit. Exploit discovery happens with
probability $p_i(T)$ for Player $i$ in round $T$. Once an exploit is discovered by a player, the

player’s  holding  time  increases  deterministically.  The  resulting  transitions  of  the  Markov 
game  state  are  summarized  in  Table  1.  A  visual  depiction  of  the  states  of  the  game  is 
presented  in  Figure  1. 
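
The transition rule described above (when nobody attacks) can be sketched as a function that returns the distribution over next states. The function names are hypothetical, and the sketch takes the current-round discovery probabilities `p1` and `p2` as plain numbers; independence of the two players' discoveries is the assumption stated in the model.

```python
from typing import Dict, Tuple

State = Tuple[int, int, int]  # (T, tau1, tau2)


def next_holding(tau: int, p: float) -> Dict[int, float]:
    """Distribution of one player's next holding time if nobody attacks."""
    if tau > 0:
        # Exploit already known: holding time grows deterministically.
        return {tau + 1: 1.0}
    # Exploit unknown: discovered this round with probability p.
    return {1: p, 0: 1.0 - p}


def wait_transitions(state: State, p1: float, p2: float) -> Dict[State, float]:
    """Next-state distribution when both players wait for one round."""
    T, tau1, tau2 = state
    dist: Dict[State, float] = {}
    for n1, pr1 in next_holding(tau1, p1).items():
        for n2, pr2 in next_holding(tau2, p2).items():
            prob = pr1 * pr2  # discoveries are independent across players
            if prob > 0.0:
                key = (T + 1, n1, n2)
                dist[key] = dist.get(key, 0.0) + prob
    return dist
```

From `(0, 0, 0)` the function returns up to four successor states (neither, either, or both players discover the exploit); from a state such as `(3, 2, 0)` only Player 2's discovery is random.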
Table 1. Markov game action sets and state transitions as a function of $(T, \tau_1, \tau_2)$,
the state of the game. The game always starts in $(T, 0, 0)$. As Player $i$ discovers the
exploit, $\tau_i$ becomes greater than zero and Player $i$'s action set includes attack.


Figure 1. Diagram of states in the Markov game. The arrows in the diagram show
the possible transitions from one state to another, as described in Table 1. The horizontal
axis describes increases in holding time for Player 1, $\tau_1$, and the vertical axis describes
increases in holding time for Player 2, $\tau_2$.


Let $V(T, \tau_1, \tau_2)$ define the value of the game in state $(T, \tau_1, \tau_2)$; this value
represents the expected value to the players if they play the game starting at that state.
Because the game is zero sum, payoffs for both players can be described by a single
value. To analyze the game, we seek to characterize this value function. In particular,
$V(0, 0, 0)$ is the value of engaging in cyber conflict. We seek to characterize $V(T, \tau_1, \tau_2)$
for every state of the Markov game. We proceed in our analysis by considering three
cases on $\tau_1$, $\tau_2$.

2.  Both  Players  Know  the  Exploit 

In this case, we have $\tau_1 > 0$, $\tau_2 > 0$ and both players have full action sets, meaning
each may attack or wait. Table 2 represents the payoffs of the Markov game in such a
state in matrix form. Each entry in the matrix contains a single real number, since the
game is zero sum. If both players wait, the value is determined by future play. If one
player attacks and the other waits, the attacking player receives the full value of his
munition. If both players attack simultaneously, the result of the game is the net of the
two munition values, $a_1(\tau_1) - a_2(\tau_2)$.


                       Player 2 plays: W                 Player 2 plays: A

Player 1 plays: W      $V(T+1, \tau_1+1, \tau_2+1)$      $-a_2(\tau_2)$

Player 1 plays: A      $a_1(\tau_1)$                     $a_1(\tau_1) - a_2(\tau_2)$

Table  2.  Payoff  matrix  for  the  Markov  game  when  both  players  know  the  exploit. 
The  payoff  associated  with  “Wait,  Wait”  depends  on  the  future  evolution  of  the  game. 

This  leads  to  the  following  observation. 

Theorem 1. For any game state $(T, \tau_1, \tau_2)$ such that $\tau_1 > 0$ and $\tau_2 > 0$, “Attack, Attack” is
an iterated elimination of dominated strategies equilibrium with a value of
$a_1(\tau_1) - a_2(\tau_2)$.

Proof. Suppose $V(T+1, \tau_1+1, \tau_2+1) \ge 0$. Then $V(T+1, \tau_1+1, \tau_2+1) \ge -a_2(\tau_2)$ and
$a_1(\tau_1) \ge a_1(\tau_1) - a_2(\tau_2)$. Therefore, “Attack” is a dominating strategy for Player 2. Given
Player 2 chooses “Attack,” Player 1 must also play “Attack” and “Attack, Attack” is an
equilibrium. A symmetric argument holds if $V(T+1, \tau_1+1, \tau_2+1) < 0$.

□ 

Theorem  1  results  in  the  following  corollary. 

Corollary 1. If the game starts in state $(T, \tau_1, \tau_2)$, with $\tau_1 > 0$ and $\tau_2 > 0$, the game
terminates immediately and

$$V(T, \tau_1, \tau_2) = a_1(\tau_1) - a_2(\tau_2).$$
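
The elimination argument in the proof of Theorem 1 can be checked numerically on the two-by-two payoff matrix. The sketch below is illustrative only: the function name and argument names are hypothetical, payoffs are written from Player 1's (the maximizer's) point of view, and the continuation value `v_next` is simply supplied as a number rather than computed.

```python
def attack_attack_value(v_next: float, a1: float, a2: float) -> float:
    """Replay the proof of Theorem 1 and return the value of (Attack, Attack).

    v_next stands for V(T+1, tau1+1, tau2+1); a1 and a2 stand for the
    nonnegative attack values a1(tau1) and a2(tau2).
    """
    if a1 < 0 or a2 < 0:
        raise ValueError("attack values must be nonnegative")
    W, A = 0, 1
    # Rows: Player 1 plays W or A. Columns: Player 2 plays W or A.
    M = [[v_next, -a2],
         [a1, a1 - a2]]
    if v_next >= 0:
        # Player 2 (minimizer): column A is never worse than column W.
        assert all(M[r][A] <= M[r][W] for r in (W, A))
        # With column W eliminated, Player 1 (maximizer) prefers row A.
        assert M[A][A] >= M[W][A]
    else:
        # Player 1 (maximizer): row A is never worse than row W.
        assert all(M[A][c] >= M[W][c] for c in (W, A))
        # With row W eliminated, Player 2 (minimizer) prefers column A.
        assert M[A][A] <= M[A][W]
    return M[A][A]
```

Whatever continuation value is supplied, the function returns $a_1(\tau_1) - a_2(\tau_2)$, in line with Corollary 1.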

Interpreting the results of Theorem 1 and the above corollary, a game starting in
$(T, 0, 0)$, $T \ge 0$, ends, optimally, no later than when one of the following states is reached:
$(T, 1, \tau_2)$ or $(T, \tau_1, 1)$. However, the game may also end earlier if a player who discovers
the exploit chooses to attack before the second player has discovered the exploit. Because
each $a_i(\cdot)$ has a unique, associated $\tau_i$, for ease of exposition we drop the index $i$ from
future uses of $\tau$. For the remainder of this paper, statements like $a_2(\tau)$ should be
understood to mean $a_2(\tau_2)$.

3.  Only  One  Player  Knows  the  Exploit 

For  simplicity,  we  develop  the  theory  from  a  state  where  Player  1  has  the  exploit 
and  Player  2  does  not.  The  analysis  follows  identical  lines  in  the  opposing  situation.  In 
this case, Player 1 has a full action set and Player 2 may only wait to discover the exploit:
$\theta_1 = \{A, W\}$, $\theta_2 = \{W\}$. Suppose the state of the game is $(T, \tau, 0)$. We define

$$Y = (1 - p_2(T))\,V(T+1, \tau+1, 0) + p_2(T)\,V(T+1, \tau+1, 1)$$

to  be  the  expected  utility  if  both  players  choose  to  wait  at  time  T.  Table  3  displays  the 
payoffs  in  matrix  form. 


                            Player 2 plays: Wait

Player 1 plays: Wait        $Y$

Player 1 plays: Attack      $a_1(\tau)$

Table 3. Payoffs for the case where Player 1 knows the exploit and Player 2
does not. By definition, Player 2 has a singleton action set and the matrix
reduces to a single column. Player 1 prefers to attack if $Y < a_1(\tau)$.

The fundamental analytic question is “from which states does Player 1 prefer to
attack?” If Player 2 discovers the exploit, the game transitions to the scenario described
previously and immediately concludes as specified in Theorem 1. We characterize states
$(T, \tau, 0)$ from which Player 1 prefers to attack as follows. We define $v_\tau(h)$ as the
expected utility to Player 1 if he waits $h$ time periods before attacking, starting in
state $(T, \tau, 0)$.

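In particular, we have:

$$\begin{aligned}
v_\tau(0) &= a_1(\tau) \\
v_\tau(1) &= q_2(T)\,a_1(\tau+1) + p_2(T)\bigl(a_1(\tau+1) - a_2(1)\bigr) \\
v_\tau(2) &= q_2(T+1)\,q_2(T)\,a_1(\tau+2) + p_2(T+1)\,q_2(T)\bigl(a_1(\tau+2) - a_2(1)\bigr) + p_2(T)\bigl(a_1(\tau+1) - a_2(1)\bigr) \\
&\;\;\vdots \\
v_\tau(h) &= a_1(\tau+h)\prod_{k=0}^{h-1} q_2(T+k) + \sum_{k=0}^{h-1} \bigl(a_1(\tau+k+1) - a_2(1)\bigr)\,p_2(T+k)\prod_{j=0}^{k-1} q_2(T+j)
\end{aligned} \tag{1}$$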
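
Equation (1) can be evaluated with a single loop that tracks the probability that Player 2 has not yet discovered the exploit. The sketch below is one possible implementation under stated assumptions: the function name `v` and its argument order are hypothetical, `a1` and `p2` are passed as Python callables, and `a2_1` stands for the constant $a_2(1)$.

```python
from typing import Callable


def v(h: int, tau: int, T: int,
      a1: Callable[[int], float], a2_1: float,
      p2: Callable[[int], float]) -> float:
    """Expected utility to Player 1 of waiting h rounds from state (T, tau, 0)."""
    survive = 1.0  # probability that Player 2 has not discovered the exploit yet
    total = 0.0
    for k in range(h):
        # Player 2 discovers in round T+k; both players then attack at once
        # (Theorem 1), giving a1(tau+k+1) - a2(1).
        total += (a1(tau + k + 1) - a2_1) * p2(T + k) * survive
        survive *= 1.0 - p2(T + k)
    # Player 2 never discovered the exploit: Player 1 attacks after h rounds.
    return total + a1(tau + h) * survive
```

With the illustrative inputs `a1 = lambda t: float(t)`, `a2_1 = 1.0`, and a constant `p2` of 0.5, starting from $\tau = 1$ and $T = 0$, the function reproduces the hand expansions of $v_\tau(0)$, $v_\tau(1)$, and $v_\tau(2)$ in Equation (1).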
The definition of $v_\tau(h)$ allows us to evaluate the states from which Player 1
prefers to attack. Player 1 prefers to attack rather than wait in state $(T, \tau, 0)$ if and only if
the following holds:

$$a_1(\tau) = v_\tau(0) \ge v_\tau(h) \quad \text{for all } h \ge 1. \tag{2}$$

This  statement  mirrors  our  intuition  that  a  player  should  attack  only  if  an 
immediate  attack  results  in  a  higher  utility  than  waiting  for  any  number  of  turns 
before  attacking. 

Theorem 2. If $a_1(\tau)$ is concave and nondecreasing, and $p_2(T)$ is nondecreasing, then
$v_\tau(0) \ge v_\tau(1)$ implies that Player 1 should attack in state $(T, \tau, 0)$ (i.e., Player 1 can
never do better by waiting).

Proof. We proceed by showing that the theorem assumptions imply that
$v_\tau(0) \ge v_\tau(h)$ for all $h \ge 2$.

Consider the quantity

$$\begin{aligned}
v_\tau(h+1) - v_\tau(h) &= a_1(\tau+h+1)\prod_{k=0}^{h} q_2(T+k) - a_1(\tau+h)\prod_{k=0}^{h-1} q_2(T+k) \\
&\quad + \bigl(a_1(\tau+h+1) - a_2(1)\bigr)\,p_2(T+h)\prod_{j=0}^{h-1} q_2(T+j) \\
&= \prod_{k=0}^{h-1} q_2(T+k)\,\bigl[a_1(\tau+h+1) - a_1(\tau+h) - a_2(1)\,p_2(T+h)\bigr].
\end{aligned}$$

We know that $v_\tau(0) \ge v_\tau(1)$, which implies that

$$\begin{aligned}
0 &\ge v_\tau(1) - v_\tau(0) \\
&= a_1(\tau+1) - a_1(\tau) - p_2(T)\,a_2(1) \\
&\ge a_1(\tau+h+1) - a_1(\tau+h) - p_2(T)\,a_2(1),
\end{aligned}$$

where the last inequality came from the fact that $a_1(\cdot)$ is concave and nondecreasing.
Continuing with the last expression above, we have

$$\begin{aligned}
0 &\ge a_1(\tau+h+1) - a_1(\tau+h) - p_2(T)\,a_2(1) \\
&\ge a_1(\tau+h+1) - a_1(\tau+h) - p_2(T+h)\,a_2(1),
\end{aligned}$$

where the last inequality came from the fact that $p_2(\cdot)$ is nondecreasing and $a_2(1)$ is
nonnegative. Finally, multiplying both sides of the inequality by the nonnegative number
$\prod_{k=0}^{h-1} q_2(T+k)$ gives

$$\begin{aligned}
0 &\ge \prod_{k=0}^{h-1} q_2(T+k)\,\bigl[a_1(\tau+h+1) - a_1(\tau+h) - p_2(T+h)\,a_2(1)\bigr] \\
&= v_\tau(h+1) - v_\tau(h).
\end{aligned} \tag{3}$$

We can complete the proof as follows:

$$v_\tau(h) - v_\tau(0) = \bigl[v_\tau(h) - v_\tau(h-1)\bigr] + \bigl[v_\tau(h-1) - v_\tau(h-2)\bigr] + \cdots + \bigl[v_\tau(1) - v_\tau(0)\bigr].$$

Each of the paired terms on the right-hand side is no greater than zero, by
Equation (3); thus, we have

$$v_\tau(h) - v_\tau(0) \le 0,$$

completing the proof.

□ 
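
Theorem 2 can be spot-checked numerically. The sketch below is not a proof and makes several assumptions for illustration: the payoff `a1_example` (concave, nondecreasing, bounded by 10) and the discovery probability `p2_example` (nondecreasing, capped at 1) are invented test functions, `a2_1` stands for $a_2(1)$, and the check is run only up to a finite horizon.

```python
from typing import Callable, Optional


def v(h: int, tau: int, T: int,
      a1: Callable[[int], float], a2_1: float,
      p2: Callable[[int], float]) -> float:
    """Expected utility of waiting h rounds from (T, tau, 0), per Equation (1)."""
    survive, total = 1.0, 0.0
    for k in range(h):
        total += (a1(tau + k + 1) - a2_1) * p2(T + k) * survive
        survive *= 1.0 - p2(T + k)
    return total + a1(tau + h) * survive


def a1_example(t: int) -> float:
    """Illustrative concave, nondecreasing payoff (an assumption of this sketch)."""
    return 10.0 * (1.0 - 0.5 ** t)


def p2_example(T: int) -> float:
    """Illustrative nondecreasing discovery probability (also an assumption)."""
    return min(1.0, 0.1 + 0.05 * T)


def theorem2_holds(tau: int, T: int, a1, a2_1: float, p2,
                   horizon: int = 50, tol: float = 1e-12) -> Optional[bool]:
    """Return None if v(0) < v(1); otherwise whether v(0) >= v(h) for h <= horizon."""
    v0 = v(0, tau, T, a1, a2_1, p2)
    if v0 < v(1, tau, T, a1, a2_1, p2):
        return None  # premise of Theorem 2 fails, so the theorem says nothing
    return all(v0 >= v(h, tau, T, a1, a2_1, p2) - tol
               for h in range(1, horizon + 1))
```

With these example functions and $a_2(1) = 5$, the premise first holds at $\tau = 4$ (starting from $T = 0$), and the check confirms that no longer wait does better over the tested horizon.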

For the remainder of this paper, we assume stationary probabilities
$p_i(T) = p_i\ \forall T$. Theorem 2 shows that $v_\tau(0) \ge v_\tau(1)$ is sufficient to prefer Attack at a
holding time of $\tau$, while Equation (2) shows that $v_\tau(0) \ge v_\tau(1)$ is necessary to prefer
Attack at $\tau$. Therefore, from state $(T, 1, 0)$ Player 1 waits for $k^* = \min\{k : v_k(0) \ge v_k(1)\}$
turns before attacking. Substituting the definition of $v_\tau(\cdot)$, we can write this as
$k^* = \min\{k : a_1(k+1) - a_1(k) \le p_2 a_2(1)\}$. The set in the definition of $k^*$ is never empty
when $a_1(\cdot)$ is bounded, concave, and nondecreasing, and $p_2 a_2(1)$ is not identically zero,
meaning that Player 1 will eventually prefer to attack. We conclude that:

$$V(T, 1, 0) = v_1(k^*). \tag{4}$$
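
The threshold $k^*$ can be found by a direct search over holding times. The sketch below rests on assumptions that are not spelled out in the text: the search starts at $k = 1$ (the holding time in state $(T, 1, 0)$), $k^*$ is read as the holding time at which Player 1 attacks, and the cap `k_max` is a hypothetical safeguard against payoff functions for which the set is empty.

```python
from typing import Callable


def optimal_holding_time(a1: Callable[[int], float], p2: float, a2_1: float,
                         k_max: int = 10_000) -> int:
    """Smallest k >= 1 with a1(k+1) - a1(k) <= p2 * a2(1).

    a2_1 stands for a2(1); p2 is Player 2's stationary discovery probability.
    """
    threshold = p2 * a2_1
    for k in range(1, k_max + 1):
        # Marginal gain from holding one more round vs. expected loss if
        # Player 2 discovers the exploit during that round.
        if a1(k + 1) - a1(k) <= threshold:
            return k
    raise ValueError("no k found up to k_max; is a1 bounded and concave?")
```

For an illustrative payoff $a_1(k) = 10(1 - 0.5^k)$ with $p_2 = 0.1$ and $a_2(1) = 5$, the marginal gain first drops below $p_2 a_2(1) = 0.5$ at $k = 4$. For a payoff that is constant for $k \ge 1$, the search returns 1.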

While we presume that most cases will have nondecreasing $a_1, a_2, p_1, p_2$ functions,

there  is  no  reason  that  it  must  be  so.  Nondecreasing  functions  model  situations  where  the 
passage  of  time  brings  increased  capability,  both  in  development  and  detection.  However, 
there  may  be  interesting,  and  operationally  relevant,  cases  where  the  functions  are 
decreasing.  Although  we  do  not  present  detailed  results  here,  the  value  functions  in  these 
alternate  situations  may  be  evaluated  directly  by  using  Equations  (1)  and  (2). 

4.  Neither  Player  Has  the  Exploit 

In this case, the game has been in play for an unknown amount of time and
$\tau_1 = \tau_2 = 0$; therefore, both players have singleton action sets,

$$\theta_1 = \{W\}, \qquad \theta_2 = \{W\}.$$
Using the theory previously developed, the value of the game, given that Player 1
discovers the exploit first, is $V(T, 1, 0)$. Similarly, if Player 2 discovers the exploit first,
the value is $V(T, 0, 1)$. In the case where both players simultaneously discover the
exploit, $V(T, 1, 1) = a_1(1) - a_2(1)$. Because the state $(T, 0, 0)$ transitions into previously
analyzed states, we are only concerned with the first transition. For stationary discovery
probabilities, the next state transition probabilities out of $S = (T, 0, 0)$, conditioned on at
least one player making a discovery, are:


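$$\begin{aligned}
\Pr\{\text{next state is } (T, 1, 0)\} &= \gamma_{1,0} = \frac{p_1(1 - p_2)}{p_1(1 - p_2) + p_2(1 - p_1) + p_1 p_2} \\
\Pr\{\text{next state is } (T, 0, 1)\} &= \gamma_{0,1} = \frac{p_2(1 - p_1)}{p_1(1 - p_2) + p_2(1 - p_1) + p_1 p_2} \\
\Pr\{\text{next state is } (T, 1, 1)\} &= \gamma_{1,1} = \frac{p_1 p_2}{p_1(1 - p_2) + p_2(1 - p_1) + p_1 p_2},
\end{aligned}$$

where we have introduced the $\gamma$ values for brevity.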


The value of the game starting from $(T, 0, 0)$ is

$V(T, 0, 0) = \gamma_{1,0} V(T, 1, 0) - \gamma_{0,1} V(T, 0, 1) + \gamma_{1,1} V(T, 1, 1)$
$= \gamma_{1,0} v_\tau^1(k_1^*) - \gamma_{0,1} v_\tau^2(k_2^*) + \gamma_{1,1} (a_1(1) - a_2(1))$, (5)

where the negative sign comes from the fact that Player 1 is a maximizing player and
Player 2 is a minimizing player. Here $v_\tau^1(\cdot), k_1^*$ denote the results of Equations (3) and (4) if
Player 1 is the first to discover the exploit, while $v_\tau^2(\cdot), k_2^*$ denote the results of Equations
(3) and (4) if Player 2 is the first to discover the exploit.
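
Equation (5) is a probability-weighted combination of three previously computed values, which a few lines of code make concrete. The sketch below assumes constant discovery probabilities and takes the two first-discoverer values as inputs. The function and argument names (`first_transition_probs`, `value_neither_has_exploit`, `v1_first`, `v2_first`) are hypothetical labels introduced here for illustration.

```python
from typing import Tuple


def first_transition_probs(p1: float, p2: float) -> Tuple[float, float, float]:
    """Return (gamma_10, gamma_01, gamma_11): the probabilities that the first
    transition out of (T, 0, 0) is to (T, 1, 0), (T, 0, 1), or (T, 1, 1)."""
    denom = p1 * (1 - p2) + p2 * (1 - p1) + p1 * p2
    if denom <= 0:
        raise ValueError("at least one discovery probability must be positive")
    return (p1 * (1 - p2) / denom, p2 * (1 - p1) / denom, p1 * p2 / denom)


def value_neither_has_exploit(p1: float, p2: float,
                              v1_first: float, v2_first: float,
                              a1_1: float, a2_1: float) -> float:
    """Evaluate Equation (5).

    v1_first: Player 1's optimal holding value if he discovers first
    v2_first: Player 2's optimal holding value if he discovers first
    a1_1, a2_1: a1(1) and a2(1), used when both discover simultaneously
    """
    g10, g01, g11 = first_transition_probs(p1, p2)
    # Player 2's gain enters with a minus sign because Player 1 maximizes
    return g10 * v1_first - g01 * v2_first + g11 * (a1_1 - a2_1)


# Two identical players: the game value should vanish by symmetry.
print(value_neither_has_exploit(0.1, 0.1, 2.0, 2.0, 1.0, 1.0))
```

Normalizing by the common denominator conditions on at least one player making a discovery, so the three probabilities sum to one.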
IV. NUMERICAL ANALYSIS


In this section, we consider some concrete examples of the theory developed in
the previous section. Unless otherwise specified, we assume $p_i(T) = p_i \ \forall T$ and $p_i \neq 0$.

As a notational convenience, we denote the value of any particular example as $V^n$,
where $n$ is the example number.

A. SCENARIO 1: CONSTANT $a_i$ FUNCTIONS

Suppose that Players 1 and 2 both have attack value functions such that:

$a_i(0) = 0$

$a_i(\tau) = c_i \quad \forall \tau \geq 1$

Because $a_i(\tau)$ is concave and nondecreasing for both players, we can use Theorem 2
to compute the optimal attack time for each player, $k_i^*$ for $i = 1, 2$, which is 1 for both
players. We may directly compute the value of the game using Equation (5):

$V^1 = \dfrac{p_1(1 - p_2) a_1(1) - p_2(1 - p_1) a_2(1) + p_1 p_2 (a_1(1) - a_2(1))}{p_1(1 - p_2) + p_2(1 - p_1) + p_1 p_2}$

In particular, Player 1 has a positive expected payoff if and only if:

$p_1 a_1(1) > p_2 a_2(1)$.

In this case, a player may make up for a deficiency in either discovery or
development by being strong in the other area. Because $0 < p_i < 1$, these trade-offs are
implicitly limited.
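
The positivity condition follows from expanding the Scenario 1 value. As a short worked step (a sketch of the algebra, using only the symbols already defined), the cross terms in $p_1 p_2$ cancel in the numerator:

```latex
\begin{aligned}
p_1(1 - p_2)\,a_1(1) - p_2(1 - p_1)\,a_2(1) + p_1 p_2\,\bigl(a_1(1) - a_2(1)\bigr)
  &= p_1 a_1(1) - p_2 a_2(1), \\
p_1(1 - p_2) + p_2(1 - p_1) + p_1 p_2
  &= p_1 + p_2 - p_1 p_2 = 1 - (1 - p_1)(1 - p_2) > 0, \\
V^1 &= \frac{p_1 a_1(1) - p_2 a_2(1)}{p_1 + p_2 - p_1 p_2}.
\end{aligned}
```

Because the denominator is strictly positive whenever either discovery probability is nonzero, the sign of $V^1$ is the sign of $p_1 a_1(1) - p_2 a_2(1)$.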

B. SCENARIO 2: LINEARLY INCREASING $a_1$


Suppose Players 1 and 2 have attack functions such that:

$a_i(0) = 0$

$a_1(\tau) = \tau \quad 1 \leq \tau \leq 5$
$a_1(\tau) = 5 \quad \forall \tau > 5$
$a_2(\tau_2) = c \quad \forall \tau_2 \geq 1$

These functions are also concave and nondecreasing, and we may use Theorem 2 to
determine the optimal attack time, $k^*$, for both players. Specifically, $k_2^* = 1$ and $k_1^*$ is
dependent on the values of $p_2$ and $c$ as follows:

$k_1^* = \begin{cases} 1 & \text{if } p_2 c > 1 \\ 5 & \text{otherwise} \end{cases}$

For example, suppose $a_2(1) = 1$ and $p_2 = 0.2$. As verification, we compute the values of
$v_\tau(h)$ for $h = 1, 2, \ldots, 5$. We see in Figure 2 that the maximizing value is $h = 5$.

[Figure 2 plot: value $v_\tau(h)$ (vertical axis) versus turns to wait, $h$ (horizontal axis).]

Figure 2. Value of Scenario 2 from Player 1’s point of view. The vertical
axis plots the value, $v_\tau(h)$, as a function of the number of time periods
Player 1 waits before attacking, $h$. The value function increases to the
point $h = 5$, and decreases afterward. By Theorem 2, this implies that
Player 1’s optimal attack time, $k_1^*$, is 5.
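
The verification for Scenario 2 can be sketched numerically under one plausible reading of the holding value. The assumption here (Equations (1) and (2) are not restated in this section) is that Player 1 starts at maturity 1, that on each later turn Player 2 discovers the exploit with probability $p_2$, and that upon such a discovery both players attack at once, so Player 1 nets $a_1 - a_2(1)$. This reading reproduces the marginal condition $a_1(k+1) - a_1(k) < p_2 a_2(1)$, but the absolute values it produces may be offset from those plotted; only the location of the maximum is examined. The helper name `hold_value` is hypothetical.

```python
from typing import Callable


def hold_value(a1: Callable[[int], float], p2: float, a2_1: float, h: int) -> float:
    """Expected payoff to Player 1 of planning to attack at maturity h >= 1
    (assumed model: see the accompanying text)."""
    value = 0.0
    no_discovery = 1.0  # probability Player 2 has not yet found the exploit
    for k in range(1, h):
        # Player 2 discovers on the turn that brings Player 1 to maturity k + 1;
        # both attack immediately and Player 1 nets a1(k + 1) - a2(1).
        value += no_discovery * p2 * (a1(k + 1) - a2_1)
        no_discovery *= 1.0 - p2
    # Player 2 never discovered: Player 1 attacks alone at maturity h.
    return value + no_discovery * a1(h)


def a1_scenario2(tau: int) -> float:
    """Scenario 2 payoff for Player 1: linear up to 5, constant afterward."""
    return float(min(tau, 5))


values = {h: hold_value(a1_scenario2, p2=0.2, a2_1=1.0, h=h) for h in range(1, 8)}
best_h = max(values, key=values.get)
print(best_h)
```

Enumerating `values` in this way does not rely on concavity, so the same loop applies to payoff functions for which Theorem 2 is unavailable.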

Knowing $k^*$ for both players, we may compute the value of the game, $V^2(T, 0, 0)$,
as a function of $p_1$; see Figure 3.

Figure 3. Value of Scenario 2 as a function of Player 1’s probability of
discovering the exploit, $p_1$. Here we see that the value of the game is a
concave function of Player 1’s probability of detecting the exploit.
Increases in detection probability at low detection values provide a bigger
increase in the game value than increases in detection probability at high
detection values.

C. SCENARIO 3: NONMONOTONE $a_1$

Suppose that $a_2(1) = 1$, $p_2 = 0.3$, and Player 1’s value function has a single dip,
specifically $a_1(\tau) = (1, 2, 3, 4, 5, 3, 6)$ for $\tau = 1, \ldots, 7$, as shown in Figure 4. In this case, we cannot use
Theorem 2 to compute the optimal attack time. However, we may compute the optimal
attack time directly, by computing the value of holding for each possible holding period,
as depicted in Figure 5.

Figure 4. The function $a_1$ for Scenario 3. Unlike our previous examples, the
value of Player 1’s attack has a dip at $\tau_1 = 6$. In this scenario, Theorem 2
no longer applies in finding the optimal attack time, $k_1^*$.

Because $a_1(\tau)$ is not concave and nondecreasing, we cannot apply Theorem 2.
Here we need to actually compute the numeric values of $v_\tau(h)$. Performing this
calculation, we see that $k_1^* = 5$ and it is not advisable to wait through the nonincreasing
region.

Figure 5. Player 1’s value as a function of waiting time, $h$, in Scenario 3. We
see that the payoff for waiting to $h = 7$ is less than executing at $h = 5$.

A decision maker may want to know what value of $a_1(7)$ would change
Player 1’s decision. We answer this question by performing a line search on $a_1(7)$ and
determine that the threshold value is $\approx 6.6$.
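
The line search can be illustrated with a bisection on the final payoff. This is a sketch under the same assumed reading of the holding value as the marginal rule implies (if Player 2 discovers the exploit while Player 1 waits, both attack at once and Player 1 nets $a_1 - a_2(1)$). The names `hold_value` and `threshold_for_waiting`, and the bracket endpoints, are hypothetical choices made for this example.

```python
from typing import Sequence


def hold_value(a1: Sequence[float], p2: float, a2_1: float, h: int) -> float:
    """Expected payoff of planning to attack at maturity h, where a1[tau - 1]
    stores a1(tau) (assumed model: see the accompanying text)."""
    value = 0.0
    no_discovery = 1.0
    for k in range(1, h):
        value += no_discovery * p2 * (a1[k] - a2_1)  # a1[k] is a1(k + 1)
        no_discovery *= 1.0 - p2
    return value + no_discovery * a1[h - 1]


def threshold_for_waiting(base: Sequence[float], p2: float, a2_1: float,
                          lo: float, hi: float, tol: float = 1e-9) -> float:
    """Bisect for the final-period payoff at which waiting through every period
    in `base` becomes as good as the best earlier attack time."""
    h_last = len(base) + 1

    def gain(x: float) -> float:
        seq = list(base) + [x]
        best_earlier = max(hold_value(seq, p2, a2_1, h) for h in range(1, h_last))
        return hold_value(seq, p2, a2_1, h_last) - best_earlier

    if not (gain(lo) < 0.0 <= gain(hi)):
        raise ValueError("bracket [lo, hi] does not contain the threshold")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gain(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


scenario3 = [1, 2, 3, 4, 5, 3, 6]
best_h = max(range(1, 8), key=lambda h: hold_value(scenario3, 0.3, 1.0, h))
threshold = threshold_for_waiting(scenario3[:6], 0.3, 1.0, lo=3.0, hi=10.0)
print(best_h, round(threshold, 1))
```

Bisection is used here because the gain from waiting is increasing in the final payoff, so a single sign change brackets the threshold; any other one-dimensional search would serve equally well.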
V. EXTENSIONS AND APPLICATIONS


In  this  section,  we  explore  the  operationally  relevant  implications  of  our  model. 

A.  DELAYED  ACTION 

It  may  be  the  case  that  a  player  discovers  an  exploit  and  cannot  take  action; 
specifically,  he  is  unable  (or  not  allowed)  to  attack,  patch,  or  work  towards  development 
of  a  munition  for  some  predetermined  fixed  time  after  discovery  of  an  exploit.  This  may 
be  due  to  legal,  policy,  or  organizational  limitations. 

1.  One  Player  Delayed  Action 

Suppose Player 1 has a rule where he must wait $w$ time periods after discovery
before any attack, patch, or development of a munition. Consistent with our previous
definition of perfect information, if Player 2 has the exploit, he learns if Player 1 knows
the exploit. Player 2 also knows the existence and duration of Player 1’s delay rule.

We wish to understand the value of this delayed version of our game, which we
denote as $V^w(\cdot)$.

If both players have the exploit, Player 2 can wait and exercise his munition the
turn before Player 1 is able to begin work; therefore,

$V^w(T, 1, 1) = -a_2(w - 1)$.

If Player 2 has the exploit and Player 1 does not, Player 2 may continue
developing his munition until Player 1 discovers the exploit, and an additional $(w - 1)$
time periods before attacking; therefore,

$V^w(T, 0, 1) = -\sum_{i=0}^{\infty} p_1 (1 - p_1)^i a_2(i + w)$.

Finally, if Player 1 has the exploit and Player 2 does not, there are two
possibilities. First, Player 1 may retain sole knowledge of the exploit until the end of the
waiting period, or, second, Player 2 may discover the exploit during Player 1’s forced
delay time; therefore,

$V^w(T, 1, 0) = (1 - p_2)^w V(T, 1, 0) - \sum_{i=1}^{w-1} p_2 (1 - p_2)^{i-1} a_2(w - i)$.

We may combine these expressions to write:

$V^w(T, 0, 0) = \gamma_{1,0} \left[ (1 - p_2)^w V(T, 1, 0) - \sum_{i=1}^{w-1} p_2 (1 - p_2)^{i-1} a_2(w - i) \right] - \gamma_{0,1} \left[ \sum_{i=0}^{\infty} p_1 (1 - p_1)^i a_2(i + w) \right] - \gamma_{1,1} a_2(w - 1)$. (6)


The  implication  of  this  is  that  unproductive  waiting  times  are  damaging  to  a 
player’s  prospects  in  cyber  conflict. 
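
One component of this effect is easy to examine in isolation: the value when Player 2 holds the exploit first, $-\sum_{i \ge 0} p_1 (1 - p_1)^i a_2(i + w)$. The sketch below approximates the infinite sum by truncation, assuming $a_2$ is bounded by a known constant. The names `delayed_value_player2_first` and `a2_max`, the tolerance, and the capped-linear payoff with $p_1 = 0.1$ are illustrative assumptions.

```python
from typing import Callable


def delayed_value_player2_first(a2: Callable[[int], float], a2_max: float,
                                p1: float, w: int, tol: float = 1e-12) -> float:
    """Approximate -sum_{i>=0} p1 * (1 - p1)**i * a2(i + w).

    a2_max is an upper bound on a2; the loop stops once the neglected tail,
    which is at most (1 - p1)**i * a2_max, drops below tol.
    """
    if not 0.0 < p1 <= 1.0:
        raise ValueError("p1 must lie in (0, 1]")
    total = 0.0
    no_discovery = 1.0  # (1 - p1)**i
    i = 0
    while no_discovery * a2_max > tol:
        total += p1 * no_discovery * a2(i + w)
        no_discovery *= 1.0 - p1
        i += 1
    return -total


def capped_linear(tau: int) -> float:
    """Illustrative bounded, linear development function capped at 10."""
    return float(min(max(tau, 0), 10))


values = {w: delayed_value_player2_first(capped_linear, 10.0, 0.1, w)
          for w in range(1, 13)}
for w in (1, 5, 10):
    print(w, round(values[w], 3))
```

In this illustration Player 1's value in the state is nonincreasing in $w$ and reaches the floor of $-10$ once the delay alone lets Player 2's munition mature fully.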

Consider the specific example of two evenly matched players with bounded,
linear development functions; thus: $p_1 = p_2 = 0.1$, $a_1(\tau) = a_2(\tau) = \tau$ for $0 < \tau < 10$ and
$a_1(\tau) = a_2(\tau) = 10$ for $\tau \geq 10$. By symmetry, $V(T, 0, 0) = 0$ for this game when neither
player is forced to wait.

Now consider the case where Player 1 has a waiting time, $w$. We plot Player 1’s
expected payoff as a function of $w$ in Figure 6.


[Figure 6 plot: Player 1’s expected payoff versus waiting time, $w$.]

Figure 6. Player 1’s utility curve as a function of waiting time, $w$, against an
evenly matched opponent. We see that Player 1’s utility drops off rapidly
from an expected value of zero, with the implication that waiting is costly.

We can also ask “How good does Player 1’s detection probability $p_1$ need to be in
order to make up for a given waiting time $w$?” Figure 7 shows the adjustment required in
this example; for waiting times longer than five periods, even perfect detection does not
achieve parity.

Figure 7. Player 1’s detection probability, $p_1$, required to achieve
$V^w(0, 0, 0) = 0$ as a function of waiting time, $w$. Player 1’s required
capability increases rapidly and, because $p_1$ may never be greater than 1,
parity is unachievable after $w = 9$.

The lesson of Figures 6 and 7 is that waiting times are costly and adversely affect
one’s prospects in cyber conflict.

B.  DETERRENCE 

In  the  preceding  subsection,  we  advise  belligerents  in  cyber  conflict  to  develop 
and  execute  their  attacks  quickly — a  stance  that  is  incompatible  with  the  notion  of  “crisis 
stability”  (Kent  &  Thayler,  1989)  of  classical  deterrence  theory.  Can  deterrence  in  cyber 
conflict  be  achieved  and,  if  so,  how?  Several  scholars  ask  this  question,  notably  (Sterner, 
2011).  In  this  paper,  we  consider  one  aspect  of  cyber  deterrence. 

1.  A  Short  Review  of  Strike  Stability 

The  concept  of  strike  stability  was  developed  during  the  Cold  War  to  understand 
which  sets  of  circumstances  would  lead  to  nuclear  conflict.  The  original  papers  describe 
the  development  and  application  of  this  theory  to  nuclear  arms.  Kent  and  Thayler  (1989) 
describe  a  game  that  has  many  similarities  with  the  one  described  herein;  two  players  are 
faced  with  the  decision  of  “attacking”  or  “not  attacking.”  They  make  this  decision  by 
weighing  the  benefits  of  going  “first”  or  “second,”  with  the  assumption  that  the  other 
player  will  surely  retaliate  with  whatever  force  he  has  left.  The  closer  the  ratio  of  costs  of 
going  second  to  going  first  is  to  one,  the  more  stable  the  system  is  because  the  decision 
maker  is  indifferent  to  striking  first  or  striking  second  and  may  be  deterred.  Low  values 
of strike stability indicate a large disadvantage to attacking second and therefore lead to
instability. Deterrence requires both sides to choose non-action (“Wait” in our model) at
each  decision  epoch. 

2.  Strike  Stability  for  Cyber  Conflict 

The  analysis  of  Section  4  shows  that  if  a  player  has  the  ability  to  attack,  he 
eventually  does  with  certainty.  This  means  that  cyber  conflict  with  perfect  information 
and  a  single  exploit  is  deterrence  unstable.  Intuitively,  this  is  because  there  is  no  second 
strike.  Theorem  1  is  sufficient  to  demonstrate  that  the  single-attack  case  is  deterrence 
unstable;  the  first  player  to  attack  receives  the  reward  of  his  development  to  date,  and  the 
nonattacking  player  is  left  with  an  empty  arsenal. 

Other  considerations  may  provide  some  degree  of  deterrence  in  reality.  For 
example,  military,  economic,  or  diplomatic  consequences,  or  large  cyber  munition 
arsenals,  may  provide  some  guarantee  of  a  second  strike.  Such  guarantees,  while 
important  to  deterrence,  are  outside  the  bounds  of  our  current  work.  Nevertheless, 
without these external guarantees, deterrence in cyber conflict does not exist.

VI. CONCLUSIONS AND FUTURE WORK


We  have  developed  and  exercised  a  limited,  stylized  model.  Real  situations,  of 
course,  have  many  differences  from  the  idealized  mathematics;  the  utility  of  this  work  is 
to  define  the  cyber  conflict  problem  with  perfect  information.  Additionally,  we: 

•  Demonstrate  a  framework  for  analyzing  the  problem; 

•  Demonstrate  that  in  cyber  conflict,  idle  wait  times  are  damaging,  and 
provide  a  means  to  calculate  their  disutility;  and 

•  Show  implications  for  deterrence  in  cyber  conflict. 

This  paper  considered  a  single  attack  in  discrete  time  with  perfect  information — 
three  idealizations  that  help  us  begin  to  tackle  the  problem  of  cyber  conflict.  Of  these 
three,  the  perfect  information  assumption  appears  to  be  the  richest  area  to  explore  in  the 
future,  and  with  this  exploration  come  considerations  of  credibility,  reputations,  and 
risk taking. Also ripe for future work is consideration of cases with multiple attacks.